1.  Introduction  and  summary

In  1959  Chernoff  [7]  initiated  the  study  of  the  asymptotic  theory  of  sequential 
Bayes  tests  as  the  cost  of  observation  tends  to  zero.  He  dealt  with  the  case  of  a 
finite  parameter  space.  The  definitive  generalization  of  the  line  of  attack  initiated 
in  that  paper  was  given  by  Kiefer  and  Sacks  in  [13].  Their  work  as  well  as  that 
of  Chernoff,  the  intervening  papers  of  Albert  [1],  Bessler  [3],  and  Schwarz  [19], 
and  the  subsequent  work  of  the  authors  [4]  used  implicitly  or  explicitly  the 
theory  of  large  deviations  and  applied  only  to  situations  where  hypothesis  and 
alternative  were  separated  or  at  least  an  indifference  region  was  present. 

In the meantime, in 1961, Chernoff [8] began to study the problem of testing
$H: \theta \le 0$ versus $K: \theta > 0$ on the basis of observation of a Wiener process with
drift $\theta$ per unit time as an approximation to the discrete time normal observations
problem. Having made the striking observation that study of the asymptotic
behavior of the Bayes procedures for any normal prior was in this case equivalent
to the study of the Bayes procedure with Lebesgue measure as prior and unit
cost of observation, he reduced this problem for suitable loss functions to the
solution of a free boundary problem for the heat equation. In subsequent work
([2], [9], [10] and [16]) the nature of this solution was investigated by Chernoff
and others.

In this paper we are concerned with the problem of testing $H: \theta \le 0$ versus
$K: \theta > 0$ by sampling sequentially from a member of a one-parameter exponential
(Koopman-Darmois) family of distributions (see equation (3.1)) at cost $c$ per
observation. We will assume the simple zero-one loss structure in which an error
in decision costs one unit while being right costs nothing.

Our main result, Theorem 4.2, states that if we assume a bounded continuous
prior density $\psi$ on the parameter space and that an observation has mean zero
and variance one if $\theta = 0$, then our problem is asymptotically equivalent to the
analogous Wiener process problem with drift $\theta$ per unit time, the same loss and
cost structure and prior "density" $\equiv \psi(0)$. Chernoff's observation applies here
also and this asymptotic problem is equivalent to the problem for fixed cost. A
formal result in this direction was obtained for the special case of Bernoulli
trials by Moriguti and Robbins [18]. Our technique may be viewed as an extension
to the sequential case of an approach of Wald [21] and LeCam [14].
It is clearly applicable to other testing, estimation, and general decision problems.

We begin by examining the Wiener process problem and the embedded discrete
time normal observation problem for a general continuous and bounded prior
density $\psi$. Our first two results, Lemmas 2.1 and 2.2, establish the asymptotic
relation between the Wiener process problem with prior density $\psi$ and the same
problem with prior density $\equiv \psi(0)$. Our basic tool is the similarity transform
used by Chernoff in [8] and a weak compactness theorem which is a special case
of an unpublished result of LeCam. A statement and proof of the latter for our
special case is given in the Appendix (Theorem A.1). The validity of this result
requires the use of randomized procedures. These are employed throughout the
paper, despite the fact that the Bayes procedures for all our problems are
nonrandomized. Randomization also plays an important role in considering the
relation between the discrete and continuous time problems where we make
heavy use of sufficiency. Reference to Chapter 7 of Ferguson [12] may prove
helpful.

In Section 3 we show essentially that the exponential family problem is
asymptotically at least as hard as the Wiener process problem. To do this we
successively, without substantial loss, reduce the problem to one in which
observation is carried out in blocks, the parameter space is shrunk to a neighborhood
of zero, and the time of observation is truncated. At this stage we use a
Berry-Esseen type bound essentially due to Petrov [19] to show that the normal
approximation is valid and then apply the results of Section 2. This approximation
theorem is given as Lemma 3.3 and its proof is given in the Appendix.

Finally, in the fourth section we show that the Wiener process problem is at
least as difficult asymptotically as the exponential family problem. In doing so,
we exhibit implicitly a sequence of procedures, independent of $\psi$, for which the
bound of Section 3 is achieved.

Some  concluding  remarks  and  statements  of  open  problems  are  given  in  the 
last  section. 

2.  The  normal  theory  problem 

In this section we shall describe randomized sequential procedures in continuous
and discrete time and derive asymptotic results for the Wiener process
problem and its discrete time approximations.

Let $C[0, \infty)$ be the set of all continuous functions $x$ defined on $[0, \infty)$ such
that $\lim_{t \to \infty} x(t)/t^2 = 0$, endowed with the norm $\|x\| = \sup_t |x(t)|/(1 + t^2)$. The
space $C$ is a complete separable metric space. Let $\mathcal{B}$ denote the class of Borel sets
on $C[0, \infty)$ (the product sigma field) and let $\mathcal{B}_t$ denote the Borel field generated
by the maps $x \to x(s)$ for $0 \le s \le t$.

Let $\Omega = C[0, \infty) \times [0, 1]$, let $\mathcal{A}$ be the product Borel field, and let $Q_\theta$,
$-\infty < \theta < \infty$, be the probability measure on $(\Omega, \mathcal{A})$ such that the stochastic
process $W$ and random variable $U$ given by $W(x, z) = x$, $U(x, z) = z$ are independent and
respectively a Wiener process with drift $\theta$ per unit time and a uniformly distributed
variable on $[0, 1]$. The subscript $\theta$ will be used in this section when
calculating expectations with respect to those measures or related measures of
the discrete time problem. We are interested in testing $H: \theta \le 0$ versus
$K: \theta > 0$ with zero-one loss and cost $c$ per unit time. A sequential procedure
$\pi = (\delta, \tau)$ for this problem consists of a randomized stopping time $\tau$ and a
randomized rule $\delta$. Rigorously, $\tau$ is a measurable map from $\Omega$ to $[0, \infty)$ such that
for every $z \in [0, 1]$ and $t \le \infty$ the event $[\tau(\cdot, z) < t] \in \mathcal{B}_t$. To describe $\delta$ we
begin by defining the pre-$\tau$ field $\mathcal{B}_\tau$. This is simply the class of all events $A \in \mathcal{A}$
such that for every $z \in [0, 1]$ and every $t \le \infty$ the $z$ section of $A \cap [\tau < t]$,
that is, $\{x: (x, z) \in A \cap [\tau < t]\}$, is $\mathcal{B}_t$ measurable. Given $\tau$, $\delta$ is any map from
$\Omega$ to $[0, 1]$ which is $\mathcal{B}_\tau$ measurable. The use of these procedures should be clear.
Having observed $U = z$, we employ $\tau(\cdot, z)$ and on stopping reject with probability
$\delta(\cdot, z)$ and accept otherwise.

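If $l(\delta, \theta)$ is our zero-one loss function we write the conditional risk of $\pi$ given
$\theta$ for observation cost $c$ as

(2.1)  $R_\theta(\pi, c) = E_\theta[l(\delta, \theta)] + c E_\theta(\tau)$
       $= \varepsilon(\theta) E_\theta(\delta) + [1 - \varepsilon(\theta)] E_\theta(1 - \delta) + c E_\theta(\tau)$,

where $\varepsilon(\theta) = 1$ if $\theta \le 0$ and $0$ otherwise.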

Let $M_c$ be the bimeasurable transformation of $\Omega$ onto itself given by

(2.2)  $M_c(x, z)(t) = \left( \frac{1}{\sqrt{c}}\, x(ct),\ z \right)$.

This is the similarity transformation suitable for this problem. Then,

(2.3)  $Q_\theta M_c^{-1} = Q_{\theta\sqrt{c}}$.

$M_c$ induces a mapping of the space of decision procedures onto itself as
follows:

(2.4)  $M_c: \pi \to \pi_c = (\tau_c, \delta_c)$,

where

(2.5)  $\tau_c(x, z) = c\,\tau[M_c(x, z)]$,

(2.6)  $\delta_c(x, z) = \delta[M_c(x, z)]$.

Then,

(2.7)  $E_\theta(\delta_c) = E_{\theta\sqrt{c}}(\delta)$,  $E_\theta(\tau_c) = c E_{\theta\sqrt{c}}(\tau)$,

so that

(2.8)  $R_\theta(\pi_c, 1) = R_{\theta\sqrt{c}}(\pi, c)$.

Let $\psi$ be any nonnegative measurable function on $R$. Define

(2.9)  $R(\psi, c) = \inf_\pi \int_{-\infty}^{\infty} R_\theta(\pi, c)\psi(\theta)\, d\theta$.

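Lemma 2.1.

(2.10)  $\frac{1}{\sqrt{c}} R(\psi, c) = R[\psi(\cdot\sqrt{c}), 1]$.

Proof. By (2.8),

(2.11)  $\int_{-\infty}^{\infty} R_\theta(\pi, c)\psi(\theta)\, d\theta = \sqrt{c} \int_{-\infty}^{\infty} R_{\theta\sqrt{c}}(\pi, c)\psi(\theta\sqrt{c})\, d\theta$
        $= \sqrt{c} \int_{-\infty}^{\infty} R_\theta(\pi_c, 1)\psi(\theta\sqrt{c})\, d\theta$.

Since the correspondence between $\pi$ and $\pi_c$ is one to one onto, the result follows
by taking the infima over $\pi$ on both sides.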
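
As a numerical illustration of (2.11), the following sketch checks the change of variables for one concrete family of procedures. It is not part of the argument: the fixed-time rule (stop at $t_0$ and reject $H$ if and only if $W(t_0) > 0$), the standard normal choice of $\psi$, the truncation of the integrals to a finite range, and all function names are assumptions made for the example. For this rule the conditional risk is $\Phi(-|\theta|\sqrt{t_0}) + c t_0$, and its image under $M_c$ is the fixed-time rule at $c t_0$.

```python
import math


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def risk_fixed_time(theta, t0, c):
    """Conditional risk of the (assumed) rule: stop at t0, reject iff W(t0) > 0."""
    return Phi(-abs(theta) * math.sqrt(t0)) + c * t0


def psi(theta):
    """Illustrative prior density: standard normal (an assumption)."""
    return math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)


def integrate(f, lo, hi, n=20000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h


def lhs(t0, c, span=10.0):
    """Integral of R_theta(pi, c) psi(theta), truncated to [-span, span]."""
    return integrate(lambda th: risk_fixed_time(th, t0, c) * psi(th), -span, span)


def rhs(t0, c, span=10.0):
    """sqrt(c) times the integral of R_theta(pi_c, 1) psi(theta sqrt(c))."""
    s = math.sqrt(c)
    return s * integrate(
        lambda th: risk_fixed_time(th, c * t0, 1.0) * psi(th * s),
        -span / s,
        span / s,
    )
```

Calling `lhs(4.0, 0.01)` and `rhs(4.0, 0.01)` should give the same value up to rounding, since the two integrands differ only by the substitution $\theta \to \theta\sqrt{c}$.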

All limits in the sequel are taken as $c \to 0$.


Lemma 2.2. Let $\psi$ be as above, bounded and continuous at zero. Then,

(2.12)  $\lim \frac{1}{\sqrt{c}} R(\psi, c) = \psi(0) R^*(1)$,

where $R^*(1) = R(1, 1) = \inf_\pi \int_{-\infty}^{\infty} R_\theta(\pi, 1)\, d\theta$.

Proof. Note that $R^*(1)$ is finite (see, for example, the procedure of [5]).
By Lemma 2.1, our hypothesis, and the dominated convergence theorem, we
must have

(2.13)  $\limsup \frac{1}{\sqrt{c}} R(\psi, c) = \limsup R[\psi(\cdot\sqrt{c}), 1] \le \psi(0) R^*(1)$.

On the other hand, by Theorem A.1 there exists a procedure $\pi(c)$ such that
$R[\psi(\cdot\sqrt{c}), 1] = \int_{-\infty}^{\infty} R_\theta(\pi(c), 1)\psi(\theta\sqrt{c})\, d\theta$. Further, given any sequence $c_n \downarrow 0$
there exists a procedure $\pi$ and a subsequence $\{c_{n_k}\}$ such that

(2.14)  $R_\theta(\pi, 1) \le \liminf_k R_\theta(\pi(c_{n_k}), 1)$.

Then by Fatou's lemma, and the continuity of $\psi$ at zero,

(2.15)  $\liminf_k R[\psi(\cdot\sqrt{c_{n_k}}), 1] \ge \psi(0) \int_{-\infty}^{\infty} R_\theta(\pi, 1)\, d\theta \ge \psi(0) R^*(1)$.

The lemma follows.

Consider our problem with the modification that if you sample beyond time
$T$ then there is no terminal loss and no cost for additional sampling. Let
$R_\theta(\pi, c, T)$ denote the conditional risk given $\theta$ for the modified problem.
Formally,

(2.16)  $R_\theta(\pi, c, T) = \varepsilon(\theta) E_\theta(\delta I[\tau \le T]) + [1 - \varepsilon(\theta)] E_\theta[(1 - \delta) I[\tau \le T]] + c E_\theta[\min(\tau, T)]$,

where $I[A]$ is the indicator of $A$.

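Given any procedure $\pi$, let $\pi_T = (\delta_T, \tau_T)$, a truncation of $\pi$, be defined
by $\tau_T = \min(\tau, T)$, $\delta_T = \delta$ if $\tau < T$, and $\delta_T$ minimizes the posterior Bayes risk
given $\mathcal{B}_T$ when $\tau \ge T$.

Let

(2.17)  $\bar{R}^*(1, T) = \inf_\pi \int_{-\infty}^{\infty} R_\theta(\pi_T, 1)\, d\theta$,

(2.18)  $\underline{R}^*(1, T) = \inf_\pi \int_{-\infty}^{\infty} R_\theta(\pi, 1, T)\, d\theta$.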

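Lemma 2.3. Let $K_c \to \infty$ as $c \to 0$, and let $\psi(\theta)$ be as in Lemma 2.2. Then

(2.19)  $\lim \inf_\pi \frac{1}{\sqrt{c}} \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} R_\theta\left(\pi, c, \frac{T}{c}\right)\psi(\theta)\, d\theta = \psi(0)\underline{R}^*(1, T)$,

(2.20)  $\lim \inf_\pi \frac{1}{\sqrt{c}} \int_{-\infty}^{\infty} R_\theta(\pi_{T/c}, c)\psi(\theta)\, d\theta = \psi(0)\bar{R}^*(1, T)$.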

Proof. By arguing as in the proof of Lemma 2.1 we have

(2.21)  $\inf_\pi \frac{1}{\sqrt{c}} \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} R_\theta\left(\pi, c, \frac{T}{c}\right)\psi(\theta)\, d\theta = \inf_\pi \int_{-K_c}^{K_c} R_\theta(\pi, 1, T)\psi(\theta\sqrt{c})\, d\theta$.

By arguing as in the proof of Lemma 2.2, we get that the right side of (2.21)
converges as $c \to 0$ to the right side of (2.19), which proves (2.19). Exactly the same
type of arguments prove (2.20), which completes the proof.

Lemma 2.4.

(2.22)  $\lim_{T \to \infty} \underline{R}^*(1, T) = R^*(1)$,

(2.23)  $\lim_{T \to \infty} \bar{R}^*(1, T) = R^*(1)$.

Proof. Clearly we have

(2.24)  $\underline{R}^*(1, T) \le R^*(1) \le \bar{R}^*(1, T)$.

By a weak compactness argument there exists for fixed $T$ a $\hat{\pi}(T)$ such that
$\underline{R}^*(1, T) = R(\hat{\pi}, 1, T)$, where $R(\pi, 1, T)$ denotes the integral of $R_\theta(\pi, 1, T)$
over $\theta$. Hence

(2.25)  $\bar{R}^*(1, T) - \underline{R}^*(1, T) = \bar{R}^*(1, T) - R(\hat{\pi}, 1, T)$
        $\le R(\hat{\pi}_T, 1) - R(\hat{\pi}, 1, T)$
        $\le \int_{-\infty}^{0} Q_\theta\{W(T) > 0\}\, d\theta + \int_{0}^{\infty} Q_\theta\{W(T) < 0\}\, d\theta$.

The right side of (2.25) converges to zero as $T \to \infty$, which completes the proof
of the lemma.

Before giving our final lemma we review two ways of defining sequential
procedures for discrete time problems. Let $\mathcal{X} = R^\infty \times [0, 1]$ be the product of
a countable number of copies of $R$ and $[0, 1]$, and let $\mathcal{D}$ be the Borel field on this
space. A randomized stopping time $\tau$ is now a measurable map from $\mathcal{X}$ to the
natural numbers $\{0, 1, 2, \ldots, \infty\}$ such that the event $[\tau(\cdot, z) \le n]$ is, for every
$z$ and $n$, measurable with respect to the $\sigma$-field $\mathcal{B}_n$ generated by the map
$(x_1, x_2, \ldots) \to (x_1, x_2, \ldots, x_n)$ on $R^\infty$. We shall always suppose that the probability
measure on $\mathcal{D}$ is such that the random sequence $(X_1, X_2, \ldots)$ and the
random variable $U$ given by $(X_1, X_2, \ldots)(x_1, x_2, \ldots, z) = (x_1, x_2, \ldots)$ and
$U(x_1, x_2, \ldots, z) = z$ are independent and $U$ is uniform on $[0, 1]$. Similarly a
decision rule $\delta$ is any map from $\mathcal{X}$ to $[0, 1]$ which is measurable with
respect to $\mathcal{B}_\tau$, the $\sigma$-field of all events $A \in \mathcal{D}$ such that the $z$ section of $A \cap [\tau \le n]$
is in $\mathcal{B}_n$ for every $n$ and $z$.

In this formulation (which we refer to as I) a procedure $\pi = (\delta, \tau)$ has the
same interpretation as in the continuous time problem. On the other hand,
following Ferguson [12] we can define a stopping rule $\tau$ by a sequence of functions
$(\psi_0, \psi_1, \psi_2, \ldots)$ where $\psi_j$ is a $\mathcal{B}_j$ measurable function from $R^\infty$ to $[0, 1]$ and
$\sum_{j=0}^{\infty} \psi_j \le 1$. If $\tau$ is a stopping time in the sense of (I), then the $\psi_j$ are given by

(2.26)  $\psi_j(x_1, x_2, \ldots) = \lambda[z: \tau(x_1, x_2, \ldots, z) = j]$,

where $\lambda$ is Lebesgue measure. Conversely, it is a well-known result of Wald and
Wolfowitz [23] that given a stopping time in this second mode as $(\psi_0, \psi_1, \ldots)$
there is a stopping time in the sense of I satisfying (2.26) (see the proof of
Theorem A.1). Similarly, a terminal decision rule is specified in the second mode
as a sequence $(\delta_0, \delta_1, \ldots)$ of functions from $R^\infty$ to $[0, 1]$ such that $\delta_j$ is measurable
$\mathcal{B}_j$. Again given $\delta$ of type I,

(2.27)  $\delta_j(x_1, x_2, \ldots) = \int \delta(x_1, x_2, \ldots, z)\, dz$

and by [23] to any policy $((\delta_0, \delta_1, \ldots), (\psi_0, \psi_1, \ldots))$ there corresponds a policy
$\pi = (\delta, \tau)$ satisfying (2.26) and (2.27). Now suppose that $Q_\theta$ (abusing notation)
makes the $X_i$ in $(X_1, X_2, \ldots)$ independent normal random variables with mean
$\theta$ and variance one. If $\pi$ is a policy as above we write the conditional risk given $\theta$
for the usual sequential testing problem as

(2.28)  $R_\theta(\pi, c) = \varepsilon(\theta) E_\theta(\delta) + [1 - \varepsilon(\theta)] E_\theta(1 - \delta) + c E_\theta(\tau)$
        $= \varepsilon(\theta) \sum_{j=0}^{\infty} E_\theta(\psi_j \delta_j) + [1 - \varepsilon(\theta)] \sum_{j=0}^{\infty} E_\theta[\psi_j(1 - \delta_j)] + c \sum_{j=0}^{\infty} j E_\theta(\psi_j)$,

if $\sum_{j=0}^{\infty} \psi_j = 1$ a.s. $Q_\theta$, and $= \infty$ otherwise. We shall refer to this as the discrete
time normal problem. Evidently any policy $\pi$ as above for the given $Q_\theta$
may be considered as a policy in the Wiener process problem with the same risk.
We shall want to consider the normal block problem in which we are permitted
to sample in blocks of size $N$ only and are told only the block sums $S_N = \sum_{i=1}^{N} X_i$,
$S_{2N} = \sum_{i=1}^{2N} X_i$, and so forth. Of course statistically, because of sufficiency, this
last restriction has no effect on the difficulty of the problem. Let $\mathcal{B}(S_N, S_{2N}, \ldots, S_{jN})$
be the $\sigma$-field induced on $R^\infty$ by the maps
$(x_1, x_2, \ldots) \to \left(\sum_{i=1}^{N} x_i, \sum_{i=N+1}^{2N} x_i, \ldots, \sum_{i=(j-1)N+1}^{jN} x_i\right)$.
Formally a block procedure $\pi$ is any procedure
in the discrete time problem such that $\tau$ only takes on the values $0, N, 2N, \ldots,
jN, \ldots$ with probability one, for every $z, j$,

(2.29)  $[\tau(\cdot, z) = jN] \in \mathcal{B}(S_N, S_{2N}, \ldots, S_{jN})$,

and for every $c, z, j$,

(2.30)  $[\delta(\cdot, z) \le c] \cap [\tau(\cdot, z) = jN] \in \mathcal{B}(S_N, \ldots, S_{jN})$.

We can now state:

Lemma 2.5. For every procedure $\pi$ in the Wiener process problem there exists
a normal block procedure $\pi^{(N)}$ such that

(2.31)  $|R_\theta(\pi, c) - R_\theta(\pi^{(N)}, c)| \le Nc$.

Proof. In view of our remarks we can give $\pi^{(N)}$ in the second formulation.
Define

(2.32)  $\psi_j^{(N)} = 0$ for $j \ne iN$,

(2.33)  $\psi_{iN}^{(N)} = Q_\theta[(i - 1)N < \tau \le iN \,/\, W(N), W(2N), \ldots, W(iN)]$.

(The / indicates a suitable version of the conditional probability.) Note that
since $W(iN)$ is sufficient for $\mathcal{B}_{iN}$, the $\psi_{iN}^{(N)}$ may be chosen independent of $\theta$. Strictly
speaking $\psi_{iN}^{(N)}$ is a function on $C[0, \infty)$, not $R^\infty$. However, by the usual arguments
we may in fact take $\psi_{iN}^{(N)}$ to be a function of the variables $W(N), W(2N), \ldots,
W(iN)$ only. It is clear that if the definition (2.33) defines a stopping time at all,
it will be a block time. To check that it is a stopping time we need only to show

(2.34)  $\sum_{j=0}^{\infty} \psi_j^{(N)} \le 1$.

It is enough to show that $\sum_{j=0}^{iN} \psi_j^{(N)} \le 1$ for every $i$. We have (for a suitable choice
of the conditional probability)

(2.35)  $1 \ge Q_\theta[\tau \le iN \,/\, W(N), W(2N), \ldots, W(iN)]$
        $= \psi_0^{(N)} + \sum_{j=1}^{i} Q_\theta[(j - 1)N < \tau \le jN \,/\, W(N), W(2N), \ldots, W(iN)]$
        $= \psi_0^{(N)} + Q_\theta[0 < \tau \le N \,/\, W(N), W(2N) - W(N), W(3N) - W(N), \ldots, W(iN) - W(N)]$
        $\quad + Q_\theta[N < \tau \le 2N \,/\, W(N), W(2N), W(3N) - W(2N), \ldots, W(iN) - W(2N)]$
        $\quad + \cdots + Q_\theta[(i - 1)N < \tau \le iN \,/\, W(N), W(2N), W(3N), \ldots, W(iN)]$
        $= \psi_0^{(N)} + Q_\theta[0 < \tau \le N \,/\, W(N)] + Q_\theta[N < \tau \le 2N \,/\, W(N), W(2N)]$
        $\quad + \cdots + Q_\theta[(i - 1)N < \tau \le iN \,/\, W(N), W(2N), \ldots, W(iN)]$
        $= \sum_{j=0}^{iN} \psi_j^{(N)}$.

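Now define the $\delta_j^{(N)}$ by

(2.36)  $\psi_{iN}^{(N)} \delta_{iN}^{(N)} = E_\theta\{\delta I[(i - 1)N < \tau \le iN] \,/\, W(N), \ldots, W(iN)\}$  for $i = 0, 1, \ldots$,

and $\delta_j^{(N)} = 0$ otherwise.

It is clear that $\pi^{(N)} = ((\psi_0^{(N)}, \ldots), (\delta_0^{(N)}, \ldots))$ is a block procedure and

(2.37)  $\sum_{j=0}^{\infty} E_\theta(\psi_j^{(N)} \delta_j^{(N)}) = E_\theta(\delta)$,

while

(2.38)  $\sum_{j=0}^{\infty} j E_\theta(\psi_j^{(N)}) = \sum_{i=0}^{\infty} iN\, E_\theta(\psi_{iN}^{(N)})$
        $= \sum_{i=0}^{\infty} iN\, Q_\theta[(i - 1)N < \tau \le iN]$
        $\le E_\theta(\tau) + N$.

The lemma follows.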
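
The last step of (2.38) rests on the pathwise fact that replacing $\tau$ by the end of the block in which it falls delays stopping by less than $N$. A minimal sketch of that rounding follows, assuming purely for illustration that the stopping times are simulated exponential draws; the helper names, the seed, and the value of $N$ are hypothetical and do not come from the text.

```python
import math
import random


def block_time(tau, N):
    """Round tau up to the next multiple of N (tau = 0 stays at 0)."""
    return N * math.ceil(tau / N)


def mean_times(taus, N):
    """Return (mean of tau, mean of the block-rounded tau)."""
    blocked = [block_time(t, N) for t in taus]
    return sum(taus) / len(taus), sum(blocked) / len(blocked)


random.seed(0)
N = 8
# Hypothetical stopping times; any nonnegative sample would do here.
taus = [random.expovariate(1.0 / 37.0) for _ in range(10000)]
mean_tau, mean_blocked = mean_times(taus, N)
```

Since each rounded time lies between $\tau$ and $\tau + N$, the sampling cost of the block version exceeds that of the original by at most $Nc$, which is the bound in (2.31).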

3.  The  exponential  family  problem:  lower  bound

In this section we introduce the exponential family model and derive a lower
bound to the Bayes risk of the testing problem in terms of the Wiener process
problem of the previous section. Without loss of generality we shall throughout
this section suppose that $X_1, X_2, \ldots$ are the coordinate projections of $R^\infty$ and
are thus defined on the space $\mathcal{X}$ of the previous section. We let $P_\theta$ be a probability
measure on $\mathcal{X}$ which makes the $X_i$ independent and identically distributed with
density $f_\theta$ with respect to some nondegenerate $\sigma$-finite measure $\mu$ on $R$. We take
$f_\theta$ to be the function

(3.1)  $f_\theta(x) = e^{\theta x - b(\theta)}$,

where $\theta$ ranges over a set $\Theta$ such that zero is an interior point of $\Theta$. (As in the
previous section, whatever be $\theta$, $U$ is independent of the $X_i$ and uniformly distributed
on $[0, 1]$.) Let $\psi$ be, as in Lemmas 2.3 and 2.4, a bounded probability
density (with respect to Lebesgue measure) on $\Theta$ and continuous at zero. As
before we wish to test $H: \theta \le 0$ versus $K: \theta > 0$ with zero-one loss, and at cost
$c$ per observation. Evidently the definitions of sequential procedure introduced
in connection with the normal discrete time problem are appropriate for this
exponential family problem also, the only difference being that risks must be
calculated under $P_\theta$ rather than $Q_\theta$. Since we shall occasionally have to talk about
both problems we shall use the superscripts $P$, $Q$ on expectations where this is
necessary to avoid ambiguity.

Note that

(3.2)  $E_\theta(X_1) = b'(\theta)$,  $\mathrm{Var}_\theta(X_1) = b''(\theta)$.

We shall suppose that $b(0) = b'(0) = 0$ and $b''(0) = 1$. The general case reduces
to this special one. To see this consider $Y_i = [X_i - b'(0)]/[b''(0)]^{1/2}$. The $Y_i$ are
a sequence of observations distributed according to an exponential family with
density

(3.3)  $g_\theta(y) = \exp\{\theta [b''(0)]^{1/2} y - c(\theta)\}$,

with respect to a suitable underlying measure.

If we change parameters to $\eta = \theta [b''(0)]^{1/2}$ we are back in the previous case,
although this does, of course, give the prior density $[b''(0)]^{-1/2}\psi\{\cdot\, [b''(0)]^{-1/2}\}$ for
$\eta$. Also note that there is no loss of generality in assuming that the $X_i$ are real
valued. If $X$ takes vector values (or even abstract values) and follows a one-parameter
exponential family with density of the form

(3.4)  $f_\theta(x) = e^{\theta t(x) - b(\theta)}$,

then $t(X_1), t(X_1) + t(X_2), \ldots$ is a sequence of transitive sufficient statistics (see
[12], Chapter 7) for the problem and of course $t(X_1)$ is a random variable following
an exponential family probability law of the original form.
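
The moment identities (3.2) and the normalization $b(0) = b'(0) = 0$, $b''(0) = 1$ can be checked on a small example. The sketch below assumes, for illustration only, that $\mu$ puts mass $1/2$ on each of the points $-1$ and $+1$, so that $b(\theta) = \log\cosh\theta$; this family and the helper names are not taken from the text.

```python
import math


def b(theta):
    """Cumulant function of the assumed two-point family: log cosh(theta)."""
    return math.log(math.cosh(theta))


def density(x, theta):
    """f_theta(x) = exp(theta * x - b(theta)) with respect to mu."""
    return math.exp(theta * x - b(theta))


def moments(theta):
    """Total mass, mean and variance of X under f_theta, by direct summation."""
    atoms = [(-1.0, 0.5), (1.0, 0.5)]  # (point, mu-mass), an assumed measure
    total = sum(w * density(x, theta) for x, w in atoms)
    mean = sum(w * x * density(x, theta) for x, w in atoms)
    second = sum(w * x * x * density(x, theta) for x, w in atoms)
    return total, mean, second - mean ** 2


def derivs(theta, h=1e-4):
    """Central-difference approximations to b'(theta) and b''(theta)."""
    d1 = (b(theta + h) - b(theta - h)) / (2.0 * h)
    d2 = (b(theta + h) - 2.0 * b(theta) + b(theta - h)) / (h * h)
    return d1, d2
```

For any $\theta$, `moments(theta)` should agree with `derivs(theta)` up to the finite-difference error, and at $\theta = 0$ the mean is zero and the variance is one, as the standing assumption requires.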

For any procedure $\pi$ (in form I and II) define $B_\theta(\pi, c)$ to be the conditional
risk of $\pi$ given $\theta$. Define the average risk of $\pi$, as usual, by

(3.5)  $B(\pi, c, \psi) = \int_{-\infty}^{\infty} B_\theta(\pi, c)\psi(\theta)\, d\theta$,

and let $\pi^*(c, \psi)$ denote the Bayes procedure for this problem which minimizes
$B(\pi, c, \psi)$ over all $\pi$. For convenience we refer to these procedures as $\pi^*$ in the
sequel.

We shall prove

Theorem 3.1. Under the conditions of this section,

(3.6)  $\liminf \frac{1}{\sqrt{c}} B(\pi^*, c, \psi) \ge \psi(0) R^*(1)$.

The proof proceeds by a series of lemmas. Let block procedures be defined as
in the previous section.

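Lemma 3.1. For every $\pi$, there exists a block procedure $\pi^{(N)}$ such that

(3.7)  $|B_\theta(\pi^{(N)}, c) - B_\theta(\pi, c)| \le Nc$

for every $\theta$.

Proof. The method of proof is the same as that of Lemma 2.5. Define
$\pi = ((\psi_0, \psi_1, \ldots), (\delta_0, \delta_1, \ldots))$,

(3.8)  $\psi_j^{(N)} = 0$, $j \ne iN$;  $\psi_{iN}^{(N)} = \sum_{j=(i-1)N+1}^{iN} E^{(P)}[\psi_j \,/\, S_N, S_{2N}, \ldots, S_{iN}]$,

(3.9)  $\delta_{iN}^{(N)} \psi_{iN}^{(N)} = \sum_{j=(i-1)N+1}^{iN} E^{(P)}[\delta_j \psi_j \,/\, S_N, S_{2N}, \ldots, S_{iN}]$.

Crucial use is made as before of the sufficiency of $S_{iN}$ for $P_\theta$ on $\mathcal{B}_{iN}$ and the
independence of the increments of the $S_n$ process.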

For any $\pi$ let $B_\theta(\pi, c, T)$ denote the conditional risk of $\pi$ given $\theta$ for a modified
version of the exponential family problem in which there is neither terminal loss
nor additional cost of observation incurred after time $T$. Thus,

(3.10)  $B_\theta(\pi, c, T) = \varepsilon(\theta) E_\theta^{(P)}(\delta I[\tau \le T]) + c E_\theta^{(P)}[\min(\tau, T)]$
        $\quad + [1 - \varepsilon(\theta)] E_\theta^{(P)}[(1 - \delta) I[\tau \le T]]$.

Let $R_\theta(\pi, c, T)$ denote the same conditional expectation when the observations
come from the normal distribution with mean $\theta$ and variance one, that is, when
the expectation is taken with respect to $Q_\theta$ rather than $P_\theta$. We shall also consider
truncated procedures $\pi_T$ defined in the natural way.

Lemma 3.2. For every $\pi$, there exists a block procedure $\pi^{(N)}$ such that

(3.11)  $|B_\theta(\pi^{(N)}, c, T) - B_\theta(\pi, c, T)| \le Nc$.

Proof. As in Lemma 3.1.

Note that both lemmas apply to the normal risks $R_\theta$ as a special case.

Let $P_{\theta,n}$ be the measure corresponding to the distribution of $S_n = \sum_{i=1}^{n} X_i$
where the $X_i$ are independent and identically distributed according to $f_\theta$. Let
$\Phi_{(\xi, \sigma^2)}$ be the measure corresponding to the normal distribution with mean $\xi$
and variance $\sigma^2$. Given a signed measure $R$ defined on a $\sigma$-field $\mathcal{A}$ let $\|R\| =
\sup_{A \in \mathcal{A}} |R(A)|$. Recall that if $P$, $Q$ are probability measures dominated by a
$\sigma$-finite measure $\mu$ then

(3.12)  $\|P - Q\| = \frac{1}{2} \int \left| \frac{dP}{d\mu} - \frac{dQ}{d\mu} \right| d\mu$.

We need the following lemma which may be derived in the same fashion as a
known result of Petrov [19].

Lemma 3.3. Let $\mathcal{F}$ be a family of densities (with respect to Lebesgue measure)
on $R$. Suppose that $Z_1, \ldots, Z_n$ are independent and identically distributed according
to $f \in \mathcal{F}$. Let $U_{f,n}$ be the probability induced by $n^{-1/2} \sum_{i=1}^{n} Z_i$ and let $\Phi$ be the
standard normal measure on $R$. Suppose that $\mathcal{F}$ satisfies the following conditions:

(i)  $\mathcal{F}$ is precompact when considered as a subset of $L_1$ with the usual topology;

(ii)  $c_1(\mathcal{F}) = \sup\{f(x): x \in R, f \in \mathcal{F}\} < \infty$;

(iii)  $\int_{-\infty}^{\infty} x f(x)\, dx = 0$,  $\int_{-\infty}^{\infty} x^2 f(x)\, dx = 1$ for every $f \in \mathcal{F}$;

(iv)  $c_2(\mathcal{F}) = \sup\left\{ \int_{-\infty}^{\infty} |x|^3 f(x)\, dx: f \in \mathcal{F} \right\} < \infty$.

Then,

(3.13)  $\sup\{\|U_{f,n} - \Phi\|: f \in \mathcal{F}\} \le \frac{c(\mathcal{F})}{\sqrt{n}}$.

Proof. See Appendix.

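Lemma 3.4. There exist $K_1$, $K_2(M)$ such that

(3.14)  $\|\Phi_{(\xi, \sigma^2)} - \Phi_{(0,1)}\| \le K_1 |\xi| + K_2 |\sigma^2 - 1|$

for all $\xi$ and $\sigma^2$ such that $|\sigma^2 - 1| \le M$.

Proof.

(3.15)  $\|\Phi_{(\xi, \sigma^2)} - \Phi_{(0,1)}\| \le \|\Phi_{(\xi, \sigma^2)} - \Phi_{(\xi, 1)}\| + \|\Phi_{(\xi, 1)} - \Phi_{(0,1)}\|$
        $= \|\Phi_{(0, \sigma^2)} - \Phi_{(0,1)}\| + \|\Phi_{(\xi, 1)} - \Phi_{(0,1)}\|$.

By (3.12), for $\xi > 0$,

(3.16)  $\|\Phi_{(\xi, 1)} - \Phi_{(0,1)}\|$
        $= \frac{1}{2} \left\{ \int_{-\infty}^{\xi/2} [\phi(t) - \phi(t - \xi)]\, dt + \int_{\xi/2}^{\infty} [\phi(t - \xi) - \phi(t)]\, dt \right\}$
        $= \Phi(\xi/2) - \Phi(-\xi/2)$.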

So, in general we get

(3.17)  $\|\Phi_{(\xi, 1)} - \Phi_{(0,1)}\| \le K_1 |\xi|$.

Similarly, for $|\sigma^2 - 1| \le M$,

(3.18)  $\|\Phi_{(0, \sigma^2)} - \Phi_{(0,1)}\| \le K_2 |\sigma^2 - 1|$.
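
The closed form in (3.16) is easy to confirm numerically. The sketch below compares a midpoint-rule evaluation of $\frac{1}{2}\int |\phi(t) - \phi(t - \xi)|\, dt$ with $\Phi(\xi/2) - \Phi(-\xi/2)$; the integration range, the grid size, and the function names are assumptions of the example. It also illustrates that $1/\sqrt{2\pi}$ is one admissible value for $K_1$, since $\Phi(\xi/2) - \Phi(-\xi/2) \le \xi\,\phi(0)$.

```python
import math


def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tv_numeric(xi, lo=-12.0, hi=12.0, n=20000):
    """Half the L1 distance between N(0,1) and N(xi,1), by the midpoint rule."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        t = lo + (k + 0.5) * h
        total += abs(phi(t) - phi(t - xi))
    return 0.5 * total * h


def tv_closed_form(xi):
    """Right side of (3.16), extended to negative xi by symmetry."""
    return Phi(abs(xi) / 2.0) - Phi(-abs(xi) / 2.0)
```

For moderate values of $\xi$ the two functions agree to within the quadrature error.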

From  Lemmas  3.3,  3.4  we  obtain 

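Lemma 3.5. Suppose that $\{Z_i\}$ are independent and identically distributed
with density $g_\theta$ (with respect to Lebesgue measure) where the set $\{g_\theta: |\theta| \le \varepsilon\}$
satisfies conditions (i), (ii) and (iv) of Lemma 3.3 for some $\varepsilon > 0$, and further

(3.19)  $e(\theta) = E_\theta(Z_1) = \theta + O(\theta^2)$,
        $v(\theta) = V_\theta(Z_1) = 1 + O(|\theta|)$.

Let $U_{\theta n}$ denote the distribution of $\sum_{i=1}^{n} Z_i$, and let $Q_{\theta n} = \Phi_{(n\theta, n)}$.

Then there exists a $\delta > 0$ and constants $d_1, d_2, d_3$ such that

(3.20)  $\sup\{\|U_{\theta n} - Q_{\theta n}\|: |\theta| \le \delta\} \le \frac{d_1}{\sqrt{n}} + d_2 |\theta| + d_3 \theta^2 \sqrt{n}$.

Proof.

(3.21)  $\|U_{\theta n} - Q_{\theta n}\| \le \|U_{\theta n} - \Phi_{(ne(\theta), nv(\theta))}\| + \|\Phi_{(ne(\theta), nv(\theta))} - \Phi_{(n\theta, n)}\|$.

By Lemma 3.3 and our assumptions on $\{g_\theta: |\theta| \le \delta\}$,

(3.22)  $\sup\{\|U_{\theta n} - \Phi_{(ne(\theta), nv(\theta))}\|: |\theta| \le \delta\} \le \frac{c(\delta)}{\sqrt{n}}$

for $\delta \le \varepsilon$.

On the other hand, by Lemma 3.4,

(3.23)  $\|\Phi_{(ne(\theta), nv(\theta))} - \Phi_{(n\theta, n)}\| = \|\Phi_{(\sqrt{n}(e(\theta) - \theta), v(\theta))} - \Phi_{(0,1)}\|$
        $\le K(\delta)[\sqrt{n}\, |e(\theta) - \theta| + |v(\theta) - 1|]$

for $\delta$ sufficiently small. The result follows by (3.19).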

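Remark. If $\mu$ is dominated by Lebesgue measure and

(3.24)  $\sup\left\{ e^{\theta x} \frac{d\mu}{dx}: |\theta| \le M, x \in R \right\} < \infty$

for some $M > 0$, then we may apply Lemma 3.5 to the exponential family and
deduce that

(3.25)  $\|P_{\theta,n} - Q_{\theta,n}\| \le \frac{d_1}{\sqrt{n}} + d_2 |\theta| + d_3 \theta^2 \sqrt{n}$

if $|\theta| \le M$ for suitable $d_1, d_2, d_3$.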

The following result is well known and is stated without proof.

Lemma 3.6. Let $P_1, P_2, \ldots, P_n$; $Q_1, Q_2, \ldots, Q_n$ be probability measures defined
on the real line and let $P^{(n)}$, $Q^{(n)}$ be the corresponding $n$ dimensional product
measures. Then,

(3.26)  $\|P^{(n)} - Q^{(n)}\| \le \sum_{i=1}^{n} \|P_i - Q_i\|$.

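Lemma 3.7. Suppose that $\mu$ is dominated by Lebesgue measure and
$\sup\{e^{\theta x}\, d\mu/dx: x \in R, |\theta| \le M\} < \infty$. If $|\theta| \le M$, then for any $N_c \ge 1$

(3.27)  $\max\left\{ \left| B_\theta\left(\pi, c, \frac{T}{c}\right) - R_\theta\left(\pi, c, \frac{T}{c}\right) \right| ;\ |B_\theta(\pi_{T/c}, c) - R_\theta(\pi_{T/c}, c)| \right\}$
        $\le 2cN_c + \frac{T}{cN_c}(2 + T)\left[ \frac{d_1}{\sqrt{N_c}} + d_2 |\theta| + d_3 \theta^2 \sqrt{N_c} \right]$.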

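Proof. We give the argument for $B_\theta(\pi, c, T/c)$; that for $B_\theta(\pi_{T/c}, c)$ is identical.

(3.28)  $\left| B_\theta\left(\pi, c, \frac{T}{c}\right) - R_\theta\left(\pi, c, \frac{T}{c}\right) \right|$
        $\le \left| R_\theta\left(\pi^{(N_c)}, c, \frac{T}{c}\right) - R_\theta\left(\pi, c, \frac{T}{c}\right) \right| + \left| B_\theta\left(\pi^{(N_c)}, c, \frac{T}{c}\right) - B_\theta\left(\pi, c, \frac{T}{c}\right) \right|$
        $\quad + \left| B_\theta\left(\pi^{(N_c)}, c, \frac{T}{c}\right) - R_\theta\left(\pi^{(N_c)}, c, \frac{T}{c}\right) \right|$.

By Lemma 3.2 the first two terms on the right side of (3.28) are each bounded
by $cN_c$. Now,

(3.29)  $\left| B_\theta\left(\pi^{(N_c)}, c, \frac{T}{c}\right) - R_\theta\left(\pi^{(N_c)}, c, \frac{T}{c}\right) \right|$
        $\le 2 \left| E_\theta^{(P)}(\delta^{(N_c)} I[\tau^{(N_c)} \le T/c]) - E_\theta^{(Q)}(\delta^{(N_c)} I[\tau^{(N_c)} \le T/c]) \right|$
        $\quad + c \left| E_\theta^{(P)}\left[\min\left(\tau^{(N_c)}, \frac{T}{c}\right)\right] - E_\theta^{(Q)}\left[\min\left(\tau^{(N_c)}, \frac{T}{c}\right)\right] \right|$
        $\le 2\|P_\theta S^{-1} - Q_\theta S^{-1}\| + T\|P_\theta S^{-1} - Q_\theta S^{-1}\|$,

where $S$ maps $(x_1, x_2, \ldots, z)$ into

(3.30)  $\left( \sum_{i=1}^{N_c} x_i,\ \sum_{i=N_c+1}^{2N_c} x_i,\ \ldots,\ \sum_{i=(I_c - 1)N_c + 1}^{I_c N_c} x_i \right)$

and $I_c = [T/cN_c] + 1$.

Applying Lemma 3.5 (and the following remark) and Lemma 3.6 to (3.29), the
result follows.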

We are now able to prove Theorem 3.1. We begin by proving the theorem in the
case $\mu$ is dominated by Lebesgue measure and $e^{\theta x}\, d\mu/dx$ is bounded in $x$ for $\theta$
in some neighbourhood of zero.
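
Let $K_c$, $N_c$ be positive numbers to be determined below. We have by the
previous lemmas the following relations

(3.31)  $\int_{-\infty}^{\infty} B_\theta(\pi^*, c)\psi(\theta)\, d\theta \ge \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} B_\theta(\pi^*, c)\psi(\theta)\, d\theta$
        $\ge \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} B_\theta\left(\pi^*, c, \frac{T}{c}\right)\psi(\theta)\, d\theta$
        $\ge \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} R_\theta\left(\pi^*, c, \frac{T}{c}\right)\psi(\theta)\, d\theta$
        $\quad - \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} \left[ 2cN_c + \frac{T}{cN_c}(2 + T)\left( \frac{d_1}{\sqrt{N_c}} + d_2 |\theta| + d_3 \theta^2 \sqrt{N_c} \right) \right] \psi(\theta)\, d\theta$.

Now since $\psi(\theta) \le F$ ($\psi(\theta)$ is assumed to be bounded),

(3.32)  $\frac{1}{\sqrt{c}} \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} \left[ 2cN_c + \frac{T}{cN_c}(2 + T)\left( \frac{d_1}{\sqrt{N_c}} + d_2 |\theta| + d_3 \theta^2 \sqrt{N_c} \right) \right] \psi(\theta)\, d\theta$
        $\le 2FK_c \left\{ 2cN_c + T(2 + T)\left[ \frac{d_1}{cN_c^{3/2}} + \frac{d_2 K_c}{\sqrt{c}\, N_c} + \frac{d_3 K_c^2}{\sqrt{N_c}} \right] \right\}$.

The right side converges to zero for $K_c = c^{-1/8 - 3\varepsilon}$, $N_c = c^{-7/8 + 4\varepsilon}$, and
$0 < 176\varepsilon < 1$.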
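
The role of the condition $176\varepsilon < 1$ can be seen by tracking the power of $c$ carried by each term of the bound. The sketch below does this in exact rational arithmetic. The list of four terms is an assumption of the example: it is what one obtains by bounding $|\theta|$ by $K_c\sqrt{c}$ under the integral in (3.32), with $K_c = c^{-1/8 - 3\varepsilon}$ and $N_c = c^{-7/8 + 4\varepsilon}$, and the function name is hypothetical.

```python
from fractions import Fraction as F


def exponents(eps):
    """Power of c in each term, for K_c = c**k and N_c = c**n (eps a Fraction)."""
    k = -F(1, 8) - 3 * eps
    n = -F(7, 8) + 4 * eps
    return {
        "K_c * c * N_c": k + 1 + n,                    # equals eps
        "K_c / (c * N_c**(3/2))": k - 1 - F(3, 2) * n,  # equals 3/16 - 9 eps
        "K_c**2 / (sqrt(c) * N_c)": 2 * k - F(1, 2) - n,  # equals 1/8 - 10 eps
        "K_c**3 / sqrt(N_c)": 3 * k - F(1, 2) * n,      # equals 1/16 - 11 eps
    }
```

A term tends to zero as $c \to 0$ exactly when its exponent is positive. The binding constraint is the last one, $1/16 - 11\varepsilon > 0$, which is the same as $176\varepsilon < 1$.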

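On the other hand, considering $\pi^*$ as a procedure for the Wiener process we
have

(3.33)  $\int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} R_\theta\left(\pi^*, c, \frac{T}{c}\right)\psi(\theta)\, d\theta \ge \inf_\pi \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} R_\theta\left(\pi, c, \frac{T}{c}\right)\psi(\theta)\, d\theta$

and the result follows by Lemmas 2.3 and 2.4.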

To prove the general case, that is, where $\mu$ is not dominated by Lebesgue
measure, consider the following problem.

We observe $Y_1, Y_2, \ldots$ where

(3.34)  $Y_i = (X_i, Q_i)$

with $X_i$ as before and $\{Q_i\}$ a sequence of independent identically distributed
normal random variables independent of the $\{X_i\}$ with mean $\varepsilon\theta$ and variance $\varepsilon$.
Let $W_i = X_i + Q_i$. The sequence $\{\sum_{i=1}^{n} W_i\}$ is sufficient and transitive for this new
problem. The $W_i$ are independent identically distributed according to a one
parameter exponential family of the form (3.1) with $b'(0) = 0$, $b''(0) = 1 + \varepsilon$.
Furthermore the underlying measure $\mu$ of this new family satisfies the condition
of the remark following Lemma 3.5.

If we let $B_\varepsilon(c, \psi)$ be the Bayes risk of the best procedure for the new problem
when the cost of observation (per vector) is $c$ and $\psi$ is the prior density on $\theta$, then
our initial discussion leads to

(3.35)  $\liminf_{c \to 0} \frac{B_\varepsilon(c, \psi)}{\sqrt{c}} \ge \frac{\psi(0)}{\sqrt{1 + \varepsilon}} R^*(1)$.

Of course, $B(\pi^*, c, \psi) \ge B_\varepsilon(c, \psi)$ for every $\varepsilon > 0$. The theorem follows.


4.  The  exponential  family  problem :  upper  bound 

The  basic  result  of  this  section  is: 

Theorem 4.1. Under the conditions of Section 2,

(4.1)  $\limsup_{c \to 0} \dfrac{1}{\sqrt{c}}\,B(\pi^*, c, \psi) \le \psi(0)R^*(1).$

In fact, there exists a sequence of procedures $\{\pi^{**}\}$ which is independent of $\psi$ such that

(4.2)  $\lim_{c \to 0} \dfrac{1}{\sqrt{c}}\,B(\pi^{**}, c, \psi) = \psi(0)R^*(1).$

(Dependence on $c$ in $\pi^{**}$ is suppressed for brevity.)

From Theorems 4.1 and 3.1 we derive immediately our main result:

Theorem 4.2. Under the conditions of Section 2,

(4.3)  $\lim_{c \to 0} \dfrac{1}{\sqrt{c}}\,B(\pi^*, c, \psi) = \psi(0)R^*(1).$


We shall give the proof of (4.1) in detail for the case where $\mu$ satisfies the conditions of the remark following Lemma 3.5 and sketch the additional remarks needed for the general case and the construction of $\pi^{**}$ at the end.

Proof. Let $\pi$ be any procedure for the Wiener process problem and $\pi_{T/c}$ be its truncation at $T/c$ as in Section 2. By Lemma 2.5 there exists a block procedure $\pi^{(N_c)}_{T/c}$ which by construction is truncated at $[T/c] + N_c$ such that

(4.4)  $|R_\theta(\pi^{(N_c)}_{T/c}, c) - R_\theta(\pi_{T/c}, c)| \le cN_c.$

Consider the following discrete time rule which we shall denote by $\pi^{(e)}$. Take $n_c^{(1)}$ observations. Stop and reject $H$ if $\sum\{X_i;\ 1 \le i \le n_c^{(1)}\} > A_c^{(1)}$, stop and accept $H$ if $\sum\{X_i;\ 1 \le i \le n_c^{(1)}\} < -A_c^{(1)}$. If $|\sum\{X_i;\ 1 \le i \le n_c^{(1)}\}| \le A_c^{(1)}$, take $n_c^{(2)}$ further observations and stop and reject $H$ if

(4.5)  $\sum\{X_i;\ n_c^{(1)} + 1 \le i \le n_c^{(1)} + n_c^{(2)}\} > A_c^{(2)},$

stop and accept $H$ if

(4.6)  $\sum\{X_i;\ n_c^{(1)} + 1 \le i \le n_c^{(1)} + n_c^{(2)}\} < -A_c^{(2)}.$

If

(4.7)  $|\sum\{X_i;\ n_c^{(1)} + 1 \le i \le n_c^{(1)} + n_c^{(2)}\}| \le A_c^{(2)},$

then disregard the first $n_c^{(1)} + n_c^{(2)}$ observations and follow the procedure $\pi^{(N_c)}_{T/c}$.
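The two screening stages followed by a restart can be sketched in code. This is a minimal illustration, not the authors' construction: the block procedure of Lemma 2.5 is replaced by an arbitrary callable `fallback`, the lower boundary of the second stage is assumed to lead to acceptance (by symmetry with the first stage), and the names `two_stage_rule` and `sign_fallback` are hypothetical.

```python
from typing import Callable, Sequence, Tuple

def two_stage_rule(xs: Sequence[float], n1: int, a1: float, n2: int, a2: float,
                   fallback: Callable[[Sequence[float]], Tuple[str, int]]
                   ) -> Tuple[str, int]:
    """Return (decision, number of observations used)."""
    s1 = sum(xs[:n1])                    # first-stage sum
    if s1 > a1:
        return "reject H", n1
    if s1 < -a1:
        return "accept H", n1
    s2 = sum(xs[n1:n1 + n2])             # second-stage sum, fresh observations only
    if s2 > a2:
        return "reject H", n1 + n2
    if s2 < -a2:
        return "accept H", n1 + n2
    # Both sums were inconclusive: discard them and restart on later data.
    decision, used = fallback(xs[n1 + n2:])
    return decision, n1 + n2 + used

def sign_fallback(rest: Sequence[float], m: int = 3) -> Tuple[str, int]:
    # Hypothetical stand-in for the block procedure: a fixed-sample sign rule.
    return ("reject H" if sum(rest[:m]) > 0 else "accept H", m)

print(two_stage_rule([0.0, 0.0, 0.2, 0.2, 1.0, 1.0, 1.0], 2, 1.5, 2, 1.0, sign_fallback))
```

The point of the design is that a parameter far from zero is detected cheaply in one of the first two stages, so the expensive fallback only has to perform well for $\theta$ close to zero.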
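Let $n_c^{(1)} = c^{-1/2+\varepsilon}$, $A_c^{(1)} = c^{-1/4}$, $n_c^{(2)} = c^{-3/4+3\varepsilon}$, $A_c^{(2)} = c^{-3/8+\varepsilon}$, and $N_c = c^{-7/8+4\varepsilon}$ where $176\varepsilon < 1$. In that case for absolutely continuous $\mu$, as above, we shall show that

(4.8)  $\limsup_{c \to 0} \dfrac{1}{\sqrt{c}}\,[B(\pi^{(e)}, c, \psi) - R(\pi_{T/c}, c, \psi)] \le 0.$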


Given (4.8) it follows that

(4.9)  $\limsup_{c \to 0} \dfrac{1}{\sqrt{c}}\,B(\pi^*, c, \psi) \le \limsup_{c \to 0} \dfrac{1}{\sqrt{c}} \inf_{\pi} \int R_\theta(\pi_{T/c}, c)\psi(\theta)\,d\theta = \psi(0)R^*(1, T)$

by Lemma 2.3. An application of Lemma 2.4 will then complete the proof of Theorem 4.1. To begin the proof of (4.8) note that, for arbitrary $K_c$,


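(4.10)  $B(\pi^{(e)}, c, \psi) = \int_{-\infty}^{\infty} B_\theta(\pi^{(e)}, c)\psi(\theta)\,d\theta$

  $\le c n_c^{(1)} + \int_{-\infty}^{0} P_\theta[S_{n_c^{(1)}} > A_c^{(1)}]\psi(\theta)\,d\theta + \int_{0}^{\infty} P_\theta[S_{n_c^{(1)}} < -A_c^{(1)}]\psi(\theta)\,d\theta$

  $+\ c n_c^{(2)} \int_{-\infty}^{\infty} P_\theta[|S_{n_c^{(1)}}| \le A_c^{(1)}]\psi(\theta)\,d\theta$

  $+ \int_{-\infty}^{0} P_\theta[S_{n_c^{(2)}} > A_c^{(2)}]\psi(\theta)\,d\theta + \int_{0}^{\infty} P_\theta[S_{n_c^{(2)}} < -A_c^{(2)}]\psi(\theta)\,d\theta$

  $+ \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} B_\theta(\pi^{(N_c)}_{T/c}, c)\psi(\theta)\,d\theta$

  $+ \int_{|\theta| > K_c\sqrt{c}} P_\theta[|S_{n_c^{(2)}}| \le A_c^{(2)}]\,(1 + T + cN_c)\psi(\theta)\,d\theta.$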

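Since $P_\theta[S_n > A]$ is increasing in $\theta$ we may for arbitrary $H_c > 0$ bound the right side of (4.10) by

(4.11)  $c n_c^{(1)} + P_0[|S_{n_c^{(1)}}| > A_c^{(1)}]$

  $+\ c n_c^{(2)}\{P_{-H_c\sqrt{c}}[S_{n_c^{(1)}} \ge -A_c^{(1)}] + P_{H_c\sqrt{c}}[S_{n_c^{(1)}} \le A_c^{(1)}] + 2FH_c\sqrt{c}\}$

  $+\ P_0[|S_{n_c^{(2)}}| > A_c^{(2)}] + \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} B_\theta(\pi^{(N_c)}_{T/c}, c)\psi(\theta)\,d\theta$

  $+\ (1 + T + cN_c)\{P_{K_c\sqrt{c}}[S_{n_c^{(2)}} \le A_c^{(2)}] + P_{-K_c\sqrt{c}}[S_{n_c^{(2)}} \ge -A_c^{(2)}]\},$

where $F$ is our bound on $\psi$. The idea now is to show that for suitable choices of $K_c$, $H_c$ all of the above are negligible save $\int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} B_\theta(\pi^{(N_c)}_{T/c}, c)\psi(\theta)\,d\theta$ and that this expression can be well approximated by $\int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} R_\theta(\pi_{T/c}, c)\psi(\theta)\,d\theta$. We collect the estimates we need in three propositions. All of these employ the well-known inequality (see, for example, Chernoff [6]),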

(4.12)  $P_\theta[S_n \ge A] \le \min_{t \ge 0} E_\theta\bigl(e^{t(S_n - A)}\bigr) = \min_{t \ge 0} e^{n[b(t+\theta) - b(\theta)] - tA}.$
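A small numerical illustration of (4.12) may help. The sketch assumes the standard normal family, for which $b(t) = t^2/2$ and the exact tail probability is available through `math.erfc`; the minimization over $t$ is done on a finite grid, and the helper name `chernoff_bound` is hypothetical.

```python
import math

def chernoff_bound(b, theta, n, a, t_max=5.0, steps=5000):
    """Grid minimum of exp{n[b(t+theta) - b(theta)] - t*a} over 0 <= t <= t_max."""
    best = 0.0                           # value of the exponent at t = 0
    for k in range(1, steps + 1):
        t = t_max * k / steps
        best = min(best, n * (b(t + theta) - b(theta)) - t * a)
    return math.exp(best)

def normal_b(t):
    # Standard normal family: b(0) = b'(0) = 0 and b''(0) = 1.
    return 0.5 * t * t

n, a = 100, 25.0
bound = chernoff_bound(normal_b, 0.0, n, a)
exact = 0.5 * math.erfc(a / math.sqrt(2.0 * n))   # P_0[S_n >= a] for S_n ~ N(0, n)
print(bound, exact)                               # the bound dominates the exact tail
```

In this normal case the minimizing $t$ is $A/n$ and the bound reduces to $\exp\{-A^2/(2n)\}$, which is the kind of estimate used in the propositions that follow.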


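Proposition 4.1.

(4.13)  $\lim_{c \to 0} \dfrac{1}{\sqrt{c}}\,P_0[|S_{n_c^{(1)}}| \ge A_c^{(1)}] = 0,$

(4.14)  $\lim_{c \to 0} \dfrac{1}{\sqrt{c}}\,P_0[|S_{n_c^{(2)}}| \ge A_c^{(2)}] = 0.$

Proof. We prove (4.13), and (4.14) is argued similarly. By (4.12),

(4.15)  $\log P_0[S_{n_c^{(1)}} \ge A_c^{(1)}] \le \min_{t > 0}\{n_c^{(1)} b(t) - tA_c^{(1)}\}.$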

Since  6(0)  =  6'(0)  =  0  and  6"(0)  =  1  for  t  sufficiently  small  b(t)  ^  § t2.  Take 
t  =  A(cX)/n(cX)  to  get 


(4.16)  logPoKu,  ^  4X)]  ^  -  -oo. 

Applying a similar argument to $\log P_0[S_{n_c^{(1)}} \le -A_c^{(1)}]$, the result follows.
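It is instructive to see how slowly a limit of the type (4.13) sets in. The sketch below assumes the first-stage choices $n_c^{(1)} = c^{-1/2+\varepsilon}$ and $A_c^{(1)} = c^{-1/4}$, so that $(A_c^{(1)})^2/n_c^{(1)} = c^{-\varepsilon}$, and a tail bound of the form $\exp\{-k\,c^{-\varepsilon}\}$ with a hypothetical constant $k$ (set to $0.25$ purely for illustration). Everything is computed on the logarithmic scale to avoid underflow.

```python
import math

def log_scaled_tail(log_c, eps, k):
    """log of c**(-1/2) * exp(-k * c**(-eps)), expressed through log c."""
    return -0.5 * log_c - k * math.exp(-eps * log_c)

eps, k = 1.0 / 200.0, 0.25   # k is a hypothetical constant; eps satisfies 176*eps < 1
for log_c in (-10.0, -100.0, -1000.0, -3000.0):
    print(log_c, log_scaled_tail(log_c, eps, k))
```

With these illustrative values the logarithm is still positive at $\log c = -1000$ and becomes negative only for far smaller $c$, which suggests that the statement is asymptotic in a strong sense and says little about moderate values of $c$.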
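Proposition 4.2. If $H_c = c^{-1/4-2\varepsilon}$,

(4.17)  $\lim_{c \to 0} \dfrac{1}{\sqrt{c}}\,c n_c^{(2)} P_{-H_c\sqrt{c}}[S_{n_c^{(1)}} \ge -A_c^{(1)}] = 0,$

(4.18)  $\lim_{c \to 0} \dfrac{1}{\sqrt{c}}\,c n_c^{(2)} P_{H_c\sqrt{c}}[S_{n_c^{(1)}} \le A_c^{(1)}] = 0.$

Proof. By (4.12),

(4.19)  $\log P_{-H_c\sqrt{c}}[S_{n_c^{(1)}} \ge -A_c^{(1)}] \le \min_{t \ge 0}\{n_c^{(1)}[b(t - H_c\sqrt{c}) - b(-H_c\sqrt{c})] + tA_c^{(1)}\}.$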

For $c$ sufficiently small, expanding $b$ about $-H_c\sqrt{c}$ and $b'$ about zero, we get
wMHc^c  -  4‘>N2 


log  [.S'„,n  2  -4*’]  g 


n 


(i) 


=  -ic~E{c~E  -  1 Y  ->  -00. 


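The result follows and a similar argument establishes (4.18).

In an entirely analogous fashion, we have

Proposition 4.3. If $K_c = c^{-1/8-3\varepsilon}$,

(4.21)  $\lim_{c \to 0} \dfrac{1}{\sqrt{c}}\,P_{-K_c\sqrt{c}}[S_{n_c^{(2)}} \ge -A_c^{(2)}] = 0,$

(4.22)  $\lim_{c \to 0} \dfrac{1}{\sqrt{c}}\,P_{K_c\sqrt{c}}[S_{n_c^{(2)}} \le A_c^{(2)}] = 0.$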


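As a consequence of Propositions 4.1 through 4.3, to prove (4.8) we need only show that

(4.23)  $\limsup_{c \to 0} \dfrac{1}{\sqrt{c}} \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} [B_\theta(\pi^{(N_c)}_{T/c}, c) - R_\theta(\pi_{T/c}, c)]\psi(\theta)\,d\theta \le 0.$

Now in view of (4.4)

(4.24)  $\dfrac{1}{\sqrt{c}} \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} |R_\theta(\pi_{T/c}, c) - R_\theta(\pi^{(N_c)}_{T/c}, c)|\psi(\theta)\,d\theta \le 2FK_c cN_c \to 0.$

Finally,

(4.25)  $\dfrac{1}{\sqrt{c}} \int_{-K_c\sqrt{c}}^{K_c\sqrt{c}} \psi(\theta)\,|B_\theta(\pi^{(N_c)}_{T/c}, c) - R_\theta(\pi^{(N_c)}_{T/c}, c)|\,d\theta \le 2FK_c \sup\{|B_\theta(\pi^{(N_c)}_{T/c}, c) - R_\theta(\pi^{(N_c)}_{T/c}, c)| : |\theta| \le K_c\sqrt{c}\} \to 0$

by using Lemma 3.7 and the estimates (3.32). Combining (4.24), (4.25) and (4.23), (4.8) follows.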

In the general case proceed as follows. Let $Q_1, Q_2, \ldots$ be a sequence of random variables (measurable functions) defined on the unit interval such that if we put the uniform distribution on $[0, 1]$ the $Q_i$ are independent and normally distributed with mean zero and variance one. We may, of course, think of the $Q_i$ as being defined on $\mathscr{X}$, depending on $(x_1, x_2, \ldots, z)$ through $z$ only. Define $\pi_\varepsilon^{(e)}$ as follows; $\pi_\varepsilon^{(e)}$ agrees with $\pi^{(e)}$ for the first two stages of $\pi^{(e)}$. If

(4.26)  $|\sum\{X_i;\ 1 \le i \le n_c^{(1)}\}| \le A_c^{(1)}, \qquad |\sum\{X_i;\ n_c^{(1)} + 1 \le i \le n_c^{(1)} + n_c^{(2)}\}| \le A_c^{(2)},$

then apply $\pi^{(e)}$ to the sequence

(4.27)  $X_{n^{(1)}+n^{(2)}+1} + Q_1,\ X_{n^{(1)}+n^{(2)}+2} + Q_2,\ \ldots.$
Formally,

(4.28)  $\pi_\varepsilon^{(e)}(x_1, x_2, \ldots, z) = \pi^{(e)}(x_1, \ldots, x_{n^{(1)}+n^{(2)}},\ x_{n^{(1)}+n^{(2)}+1} + Q_1(z), \ldots).$

Arguing as before but now applying Lemma 3.7 to the variables

(4.29)  $Z_i = X_{n^{(1)}+n^{(2)}+i} + Q_i$

which are readily seen to satisfy the condition of that lemma, we find that

(4.30)  $\limsup_{c \to 0} \dfrac{1}{\sqrt{c}}\,B(\pi^*, c, \psi) \le \limsup_{c \to 0} \dfrac{1}{\sqrt{c}}\,B(\pi_\varepsilon^{(e)}, c, \psi) \le \sqrt{1+\varepsilon}\,\psi(0)R^*(1, T).$

Letting $T \to \infty$ and $\varepsilon \to 0$ the result follows.

To construct a sequence of procedures which achieves the bound a slightly more involved argument is needed. First of all, arguing as before in Section 2, we show that in the Wiener process problem if $cN_cK_c \to 0$ and $T_c \to \infty$ then

(4.31)  lim  —7=  fKcv^  R  =  if/(0)R*(l) 

\J C  "  —  Kcy/C 

where $\pi$ is such that $\int R_\theta(\pi, 1)\,d\theta = R^*(1)$. Choose $T_c \uparrow \infty$ so that $T_c^2 c^{1/16-11\varepsilon} \to 0$, and consider the procedures


(4.32) 


mr 


corresponding to $\pi^{(e)}$ defined in the proof of Theorem 4.1 for $T_c$ varying as above. It is easy to check that if $\mu$ satisfies the conditions of the remark following Lemma 3.5, then

(4.33)  $\limsup_{c \to 0} \dfrac{1}{\sqrt{c}} \int_{-\infty}^{\infty} \{B_\theta[(\pi^{(N_c)}_{T_c/c})^{(e)}, c] - R_\theta(\pi_{T_c/c}, c)\}\psi(\theta)\,d\theta \le 0.$


If $\mu$ does not satisfy the conditions following Lemma 3.5 the construction is even less explicit. We construct procedures $\pi_\varepsilon^{(e)}$ corresponding to


(4.34) 


n{Tclc,  EC) 


to be defined below with variables $Q_i^{(c)}$ which are independent normal with mean zero and variance $\varepsilon_c \to 0$. It is necessary to examine the proof of Lemma 3.5 carefully since now $d_1$ will depend on $c$ and $n$. It is easy to show that there exists a constant $d_1^0$ independent of $n$ such that if $Z_i = X_i + Q_i^{(c)}$ then


dx{c)  ^  d\  — %=  exp  {  —  y2scn/2}, 


(4.35) 


and $d_1(c)$ will remain bounded above for $n = N_c$ provided that $\varepsilon_c \ge 3\log N_c/\gamma^2 N_c$, say. For $T_c$, $\varepsilon_c$ as above we have for any sequence of procedures $\{\pi\}$

(4.36)  $\limsup_{c \to 0} \dfrac{1}{\sqrt{c}} \int_{-\infty}^{\infty} [B_\theta(\pi_\varepsilon^{(e)}, c) - R_{\theta,\varepsilon_c}(\pi, c)]\psi(\theta)\,d\theta \le 0,$

where $R_{\theta,\varepsilon}$ is the risk of $\pi$ for the problem in which we observe the Wiener process with drift $\theta$ per unit time and variance $1 + \varepsilon$ per unit time. Finally, it follows from the results of Section 2 that


(4.37) 


Ro,Ec(n,c)\J/(0)  dd  =  \I/(0)R*{1). 


Therefore  if  we  take 


(4.38)

to be the truncated block policy corresponding in the sense of Lemma 2.5 to the procedure $\pi_c$ which achieves $\min_\pi \int R_{\theta,\varepsilon_c}(\pi, c)\,d\theta$ then

(4.39) 


achieve  the  bound.  The  theorem  is  proved. 


5. Concluding remarks and open problems

The techniques of this paper are evidently not limited to the zero-one loss function considered. For different bounded loss functions we must use a different similarity transform, make different choices of $K_c$, $H_c$, $N_c$, and so on, and obtain a different rate of convergence, but we arrive at similar results. For example, if $l(\theta, d) = 0$ when $d$ is the right decision and if $l(\theta, d) = \min\{|\theta|, 1\}$ when $d$ is the wrong decision, then the Bayes risk of our problem is of the order of $c^{2/3}$ and the limiting coefficient of $c^{2/3}$ is $\psi(0)$ times the Bayes risk of Chernoff's problem [8] with unit cost and Lebesgue prior. We can also treat the problem of testing with shrinking indifference regions, say, of the form $[-A\sqrt{c}, B\sqrt{c}]$ for zero-one loss. The Bayes risk is of order $\sqrt{c}$ again and the coefficient is $\psi(0)$ times the risk of the Wiener process problem with unit cost, Lebesgue prior and indifference region $[-A, B]$. On the other hand if one permits $\psi$ to vary with $c$, say, $\psi_c(t) = (1/\sqrt{c})\psi(t/\sqrt{c})$ for a fixed prior density $\psi$, one can under suitable regularity conditions for zero-one loss obtain an asymptotic risk of order $\sqrt{c}$ with coefficient the risk of the Wiener problem with unit cost and prior density $\psi$. Of course such densities, presupposing more and more surety that the parameter is near zero with decreasing cost, are not usually reasonable.

It  seems  that  these  techniques  should  also  apply  to  other  decision  problems  for 
the  exponential  family  at  least  locally  and  should  prove  useful  in  non-Bayesian 
problems  as  well. 

The result may also be generalized to nonexponential families by considering, under suitable regularity conditions, the variables

(5.1)  $\left.\dfrac{\partial \log f_\theta(X_j)}{\partial\theta}\right|_{\theta = 0}.$

To what extent an ambitious program such as that of LeCam [14] is possible in the sequential case is, however, unclear to us at present.

A great difficulty of the asymptotic theory of this paper is that in general it leads to problems for the Wiener process which, as the works of Chernoff indicate, can be solved at best approximately. In fact, from a (machine) computational point of view it might be easier, for example, to try to calculate the boundary for the Bernoulli process as an approximation to the Wiener boundary. The results of Moriguti and Robbins [17] as well as our paper indicate that such "boundary convergence" as in Schwarz [20] should hold. However, no proof is known to us.
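To make the computational remark concrete, here is a minimal sketch of how a stopping boundary for a Bernoulli problem can be computed by backward induction. It is an illustration under assumptions that are not in the text: the hypotheses are $p \le 1/2$ versus $p > 1/2$, the prior on $p$ is uniform, the loss is zero-one, sampling is truncated at a fixed `horizon`, and the function names are hypothetical.

```python
from math import comb

def posterior_prob_p_le_half(n, s):
    """P(p <= 1/2 | s successes in n trials) under a uniform prior on p.

    The posterior is Beta(s+1, n-s+1); for integer parameters its distribution
    function at 1/2 equals P(Binomial(n+1, 1/2) >= s+1).
    """
    m = n + 1
    return sum(comb(m, j) for j in range(s + 1, m + 1)) / 2 ** m

def continuation_region(cost, horizon):
    """Backward induction. Returns ({n: [s where sampling continues]}, Bayes risk)."""
    # At the horizon sampling must stop; take the better terminal decision.
    value = []
    for s in range(horizon + 1):
        q = posterior_prob_p_le_half(horizon, s)
        value.append(min(q, 1.0 - q))
    region = {}
    for n in range(horizon - 1, -1, -1):
        new_value, keep = [], []
        for s in range(n + 1):
            q = posterior_prob_p_le_half(n, s)
            stop = min(q, 1.0 - q)              # posterior error probability
            p_next = (s + 1) / (n + 2)          # predictive P(X_{n+1} = 1)
            go = cost + p_next * value[s + 1] + (1.0 - p_next) * value[s]
            if go < stop:
                keep.append(s)
            new_value.append(min(stop, go))
        region[n] = keep
        value = new_value
    return region, value[0]

region, risk = continuation_region(cost=0.01, horizon=50)
print(risk)
print({n: (min(v), max(v)) for n, v in region.items() if v})   # boundary per stage
```

For each stage the printed pair gives the smallest and largest success counts at which sampling continues, which is a discrete analogue of the two-sided boundary discussed above.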

o  o  o  o  o 

APPENDIX 

We  retain  the  notation  of  Section  1.  Our  first  aim  is  to  prove  the  following 
weak  compactness  theorem. 

Theorem A.1. Let $\pi_n = (\delta_n, \tau_n)$ be a sequence of procedures in the Wiener process problem. Then, there exists a subsequence $\{n_k\}$ and a procedure $\pi = (\delta, \tau)$ such that,

(A.1)  $\lim_k E_\theta(\delta_{n_k}) = E_\theta(\delta)$

whenever $\limsup_k E_\theta(\tau_{n_k}) < \infty$ and

(A.2)  $\liminf_k E_\theta(\tau_{n_k}) \ge E_\theta(\tau)$

for every $\theta$. ($E_\theta$ are taken with respect to $Q_\theta$ throughout.)

The  proof  proceeds  by  a  series  of  lemmas. 

The following lemma is essentially a special case of Wald's theorem [22].

Lemma A.1. Suppose that all of the $\tau_n$ have common finite range $\{t_1 < \cdots < t_s\}$. Then the result of Theorem A.1 holds for suitable $\{n_k\}$ and for $\pi = (\delta, \tau)$ such that $\tau$ has the same range with $Q_0$ probability one. Furthermore, if $\pi'_n = (\delta'_n, \tau'_n)$ is another sequence of procedures with $\tau'_n$ having the same range and $\tau'_n \ge \tau_n$ for all $n$, then we may choose $\{n_k\}$ to be the same for both sequences and choose the "limiting" $\pi' = (\delta', \tau')$ such that $\tau' \ge \tau$.

Proof.  We  write  the  ( bn ,  x„)  in  the  second  form  of  Section  3,  xn  —  (ip0n. 
•Ain,  •  •  •  »  <AS„h  &n  =  &1  n-  ‘  ■  O  wi*h 

(A. 3)  'l'in(x)  =  k[z  \  x„(x,  z)  =  tf] 

(A.4)  Sin(x)  =  f  Sin(x,  z)dz. 

J  0 

Apply the weak compactness theorem (for tests) to $L_1(\Omega, \mathscr{B}_{t_j}, Q_0)$ (see Lehmann [15], p. 354) and the diagonal process to obtain a sequence $\{n_k\}$ and $\mathscr{B}_{t_j}$ measurable functions $\psi_j$, $j = 1, \ldots, s$ such that


SIXTH  BERKELEY  SYMPOSIUM:  BICKEL  AND  YAHAV 


(A.5)  $\iint \psi_{jn_k}(x)g(x)Q_0(dx, dz) \to \iint \psi_j(x)g(x)Q_0(dx, dz)$

for every $g$ which is $\mathscr{B}_{t_j}$ measurable and such that $\iint |g(x)|\,Q_0(dx, dz) < \infty$. (The theorem is applicable since $\Omega$ is a complete separable metric space.) The $\psi_j$ are evidently nonnegative. Further, if $g$ is measurable and $Q_0W^{-1}$ integrable,


(A.6)  $\iint \psi_{jn_k}(x)g(x)Q_0(dx, dz) = E_0[\psi_{jn_k}(W)g(W)]$

  $= E_0\{\psi_{jn_k}(W)E_0[g(W) \mid \mathscr{B}_{t_j}]\} \to E_0\{\psi_j(W)E_0[g(W) \mid \mathscr{B}_{t_j}]\}$

  $= E_0[\psi_j(W)g(W)]$

by (A.5).

Therefore, 


(A.7)  (Jo 


Z  Ww)  >  1 

U=1 


=  Er 


Z  tj^w) 

Lj=  l 


Z  +AW)  >  1 

j=  i 


M  Z  *j(W)i 

j=  i 


Z  ^•(^)>  1 

L  j  =  i 


By  the  same  argument  ^0[2*-  =  i  i/q(  IT)]  =  1. 
Hence,  since  on  the  QgW~  1  are  equivalent 


(A. 8) 


& 


Z  =  i 

U=  1 


=  1. 


Evidently we may choose versions of the $\psi_j$ such that $\psi_j \ge 0$ and $\sum_{j=1}^{s} \psi_j = 1$ for all $\theta$. Finally we conclude that $\tau = (\psi_1, \ldots, \psi_s)$ is a stopping time and


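(A.9)  $E_\theta(\tau) = \sum_{j=1}^{s} t_j E_\theta(\psi_j)$

  $= \sum_{j=1}^{s} t_j E_0\{\psi_j(W)\exp[\theta W(t_j) - \tfrac{1}{2}\theta^2 t_j]\}$

  $= \lim_k \sum_{j=1}^{s} t_j E_0\{\psi_{jn_k}(W)\exp[\theta W(t_j) - \tfrac{1}{2}\theta^2 t_j]\}$

  $= \lim_k E_\theta(\tau_{n_k}).$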


Now we can by diagonalization and a similar argument obtain a further subsequence $\{n_k\}$ and $\mathscr{B}_{t_j}$ measurable functions $\gamma_j$ such that,

(A.10)  $E_0[\delta_{jn_k}(W)\psi_{jn_k}(W)g(W)] \to E_0[\gamma_j(W)g(W)]$

for every integrable function $g$ on $C$. Let,

(A.11)  $\delta_j = \gamma_j/\psi_j.$

Since,

(A.12)  $E_0[\gamma_j(W)g(W)] \le E_0[\psi_j(W)g(W)]$

for every integrable $g$, we can select $\delta_j$ so that $0 \le \delta_j \le 1$ and, of course, $\delta_j$ is $\mathscr{B}_{t_j}$ measurable. Evidently, $((\psi_1, \ldots, \psi_s), (\delta_1, \ldots, \delta_s))$ is a policy in the second form and $\{n_k\}$ satisfy (A.1) and (A.2). To obtain the procedure in form I simply define (following Wald and Wolfowitz [23]),

(A.13)  $\tau(x, z) = t_r$ if $\sum_{j=1}^{r-1} \psi_j(x) < z \le \sum_{j=1}^{r} \psi_j(x)$,

  $\delta(x, z) = \delta_r(x)$ on the set $[\tau(x, z) = t_r]$.

Since $\tau'_n \ge \tau_n$, the statement of the lemma leads to limiting times (in the second form) with $\sum_{j=1}^{i} \psi'_j(x) \le \sum_{j=1}^{i} \psi_j(x)$ for every $i$ and $x$, and our second assertion follows from (A.13). The lemma is proved.
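The passage from the second form (stopping probabilities given the path) to the first form (a stopping time depending on an auxiliary uniform variable $z$) in (A.13) is an inverse-distribution-function construction. A minimal sketch, with hypothetical names and illustrative numbers:

```python
from itertools import accumulate

def stopping_index(psi, z):
    """Return r (1-based) with sum_{j<r} psi_j < z <= sum_{j<=r} psi_j, for 0 < z <= 1."""
    for r, upper in enumerate(accumulate(psi), start=1):
        if z <= upper:
            return r
    return len(psi)   # guards against rounding when the psi_j sum to slightly below 1

times = [1.0, 2.0, 4.0]   # possible stopping times t_1 < t_2 < t_3 (illustrative)
psi = [0.2, 0.5, 0.3]     # stopping probabilities for one fixed path x (illustrative)
print(times[stopping_index(psi, 0.65) - 1])
```

For a fixed path the set of $z$ mapped to $t_r$ is an interval of length $\psi_r$, so a uniform $z$ reproduces the prescribed stopping probabilities; larger partial sums give earlier stopping for every $z$, which is how the ordering in the second assertion is carried over.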

Lemma A.2. The theorem is valid if it is true that there exists a $T$ such that $Q_0[\tau_n \le T] = 1$ for all $n$. Furthermore, order is preserved in the limit as in Lemma A.1.

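Proof. Consider a grid $0, T/2^m, 2T/2^m, \ldots, T$. Define $\tau_n^{(m)} = kT/2^m$ if $(k-1)T/2^m < \tau_n \le kT/2^m$ for $k = 0, 1, \ldots, 2^m$.

Let $\pi_n^{(m)} = (\tau_n^{(m)}, \delta_n)$. (Note that $\delta_n$ is $\mathscr{B}_{\tau_n^{(m)}}$ measurable.) Then,

(A.14)  $E_\theta(\tau_n^{(m)}) - E_\theta(\tau_n) \le T/2^m$

and

(A.15)  $R_\theta(\pi_n^{(m)}) - R_\theta(\pi_n) \le T/2^m.$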
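The discretization used here rounds a stopping time up to the next point of a dyadic grid. The short sketch below (hypothetical function name, illustrative values) shows the two properties the proof relies on: the rounded time exceeds the original by at most $T/2^m$, and refining the grid never increases it.

```python
import math

def round_up_to_grid(tau, T, m):
    """Smallest grid point k*T/2**m that is >= tau, for 0 <= tau <= T."""
    step = T / 2 ** m
    k = math.ceil(tau / step)      # (k-1)*step < tau <= k*step
    return k * step

T, tau = 1.0, 0.3
vals = [round_up_to_grid(tau, T, m) for m in range(1, 6)]
print(vals)   # non-increasing in m, each within T/2**m above tau
```

Because each rounded time takes only finitely many values, Lemma A.1 applies to it, and the monotonicity in $m$ is what allows the limits to be ordered.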

Extract  a  subsequence  { nk }  and  limits  in  the  sense  of  Lemma  A.l  r(m).  S{m)  for 
each  of  the  sequences  n'™'.  Since  z(nm)  ^  z{„m+l)  for  every  n.  we  may  suppose  that 
z(m)  ^  z(m+  u  for  every  m.  Let  z  =  limm  z(m).  Note  that  J^m)  e  1(  for  every 

m  and, 

(A. 16)  =  Hm  ^re¬ 
consider  the  functions  These  are  ^ru)  measurable  for  m  ^  j.  Extract  a 

subsequence  { mk }  by  the  diagonal  process  and  Jft0)  measurable  functions  SU)  such 
that 

(A. 17)  A’„[^">(ir,  U)gj(W.  C)]  -  £,[S0,<»'.  U)g]{W.  C7)]. 

for  every  gq  which  is  measurable  and  bounded  for  every  6.  This  follows  by 
the  weak  compactness  theorem  for  test  functions  applied  to  J#Tu)  successively 
since  the  Qe  are  all  equivalent  on  &xu)  and  the  space  Q  is  complete  separable 
metric.  By  construction  for  every  9  the  $(J)  form  a  martingale  and  in  view  of 
(A.  16)  and  by  the  martingale  convergence  theorem. 

(A. 18)  |Jr] 
a.s. $Q_\theta$ for every $\theta$. Let,

(A. 19)  5 

Then  <5  is  measurable  and 

(A. 20)  E,0)  =  Ee(fr")  =  lim  Ee(l>M)  =  lim  £#(<5„r) 

k 

while 

(A. 21 )  Ee( t)  =  lim  E(T{mk))  =  lim  lim  Ee( z(nmk)) 

k  k  r  r 

^  lim  Ee{ t  ) 

r 

by  (A.4).  The  lemma  follows. 

We complete the proof of the theorem. Given $T$ let $(\tau^{(T)}, \delta^{(T)})$ be the limits guaranteed by Lemma A.2 for a subsequence of the procedures $\pi_n^{(T)} = (\tau_n^{(T)}, \delta_n^{(T)})$ given by

(A.22)  $\tau_n^{(T)} = \min(\tau_n, T), \qquad \delta_n^{(T)} = \delta_n$ if $\tau_n \le T$, and $\delta_n^{(T)} = 0$ otherwise.

By Lemma A.2 we can find a subsequence $\{n_k\}$ which works for every $T = 1, 2, \ldots$ and such that $\tau^{(j)} \le \tau^{(j+1)}$ for every $j$. Let $\tau = \lim_j \tau^{(j)}$. By the monotone convergence theorem,

(A.23)  $E_\theta(\tau) = \lim_j E_\theta(\tau^{(j)}) \le \liminf_k E_\theta(\tau_{n_k}).$

Consider the sequence $\delta^{(j)}$. Tracing back its construction via Lemmas A.1 and A.2 it is easy to see that the ordering $\delta_n^{(j)} \le \delta_n^{(j+1)}$ is preserved with $Q_0$ probability one in the limit. Let $\delta = \sup_j \delta^{(j)}$. Clearly $\delta$ is measurable and by the monotone convergence theorem,

(A.24)  $E_\theta(\delta) = \lim_j \lim_k E_\theta(\delta_{n_k}^{(j)}).$

Therefore,

(A.25)  $\limsup_k |E_\theta(\delta) - E_\theta(\delta_{n_k})| \le \limsup_j \limsup_k Q_\theta[\tau_{n_k} > j] \le \limsup_j \dfrac{1}{j} \limsup_k E_\theta(\tau_{n_k}) = 0,$

if $\limsup_k E_\theta(\tau_{n_k}) < \infty$. The theorem follows.

Proof  of  Lemma  3.3.  We  proceed  as  in  [18] 


U,,n  ~  ®|  =  \  |/.(0  -  *<<) | 


dt 


(A. 26) 
where $f_n = dU_{f,n}(t)/dt$ and $\phi$ is the standard normal density. By the Schwarz and Minkowski inequalities,


A -27)  \\Ut',  -  *||  S  f  Jd  +  x) 


dx 


1/2 


J(1  +  x)2[f„{x)  -  (f)(x)]2  dx 

^  ^l^jj  “  4>{x)\2  dx | 

J  [ xfn(x )  —  x4>{x)\2  dx 


1/2 


where $C$ is a numerical constant. Since $C_1(\mathscr{F}) < \infty$ we may apply the Plancherel theorem to obtain


(A. 28) 


J  [/»(*)  ~  </>(^)]2  dx  =  2 n 


dt , 


where $\lambda(t) = \int e^{itx} f(x)\,dx$. Similarly,


(A. 29)  j  \_xfn(x)  —  x(f){x)~\2  dx 


It  is  well  known  that 


(A. 30) 

(A. 31) 
for 

(A. 32) 

and 

(A.33) 


A„|  — 7=  |  -  e 

n 


-t2l  2 


< 


6'3{C’2('y)}{  |i|3  +  |(|2}e-'2'4, 


< 


CACz^n}  {|(|3  +  |,|4}e-.>/4 


f  < 


C^/n 


-  c\!2{spy 


< 


where  Ct  —  C5  are  numerical  constants. 

Finally note that since the Riemann-Lebesgue lemma holds uniformly on compact sets of $L_1$, we have

(A. 34)  sup  {|A(0|:|<|  ^  CA/C\12 (#"),  f  e  =  C3(&)  <  1. 
(To prove this note that the map $(f, t) \to |\lambda(t)|$ is continuous on $L_1 \times [-\infty, \infty]$ with $\lambda(-\infty) = \lambda(+\infty) = 0$. Since $|\lambda(t)| < \int |f(x)|\,dx$ for every $t \ne 0$, (A.31) follows.) Now,

(A.35)  _  e-'2n  dt 


S  (CHn)Cl^)  J 

+  L 


|r|>C4nV2C-  1/2(Jir) 


(|*|3  +  \t\2}2  e-,2!1  dt 


e  '2/2  dt 


|  >C4nI''2C-  '/2(iO 


+  cr2{F)  f 

x"(t) 

*  |l|  >  C.n'^C" 

\yjnj 

^  Cn{Clmin  +  cr2(^)}  ^  C2(^)A 


since  C3  <  1.  A  similar  estimate  can  be  given  for  the  second  term  on  the  right 
of  (A.27).  The  result  follows. 